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Abstract 

Geomagnetic storms play a critical role in space weather physics with the potential 
for far reaching economic impacts including power grid outages, air traffic re-routing, 
satellite damage and GPS disruption. The LFM-MIX is a state-of-the-art coupled 
magnetospheric-ionospheric model capable of simulating geomagnetic storms. Imbed- 
ded in this model are physical equations for turning the magnetohydrodynamic state 
parameters into energy and flux of electrons entering the ionosphere, involving a set 
of input parameters. The exact values of these input parameters in the model are 
unknown, and we seek to quantify the uncertainty about these parameters when model 
output is compared to observations. The model is available at different fidelities: a 
lower fidelity which is faster to run, and a higher fidelity but more computationally 
intense version. Model output and observational data are large spatio-temporal sys- 
tems; the traditional design and analysis of computer experiments is unable to cope 
with such large datasets that involve multiple fidelities of model output. We develop an 
approach to this inverse problem for large spatio-temporal datasets that incorporates 
two different versions of the physical model. After an initial design, we propose a se- 
quential design based on expected improvement. For the LFM-MIX, the additional run 
suggested by expected improvement diminishes posterior uncertainty by ruling out a 
posterior mode and shrinking the width of the posterior distribution. We also illustrate 
our approach using the Lorenz '96 system of equations for a simplified atmosphere, us- 
ing known input parameters. For the Lorenz '96 system, after performing sequential 
runs based on expected improvement, the posterior mode converges to the true value 
and the posterior variability is reduced. 

Keywords: Computer Experiments; Expected Improvement; Geomagnetic 
Storm; Inverse Problem; Lorenz '96; Model Fidelity; Sequential Design; 
Uncertainty Quantification 
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1 Introduction 



The Lyon-Fedder-Mobarry (LFM) magnetohydrodynamical model, coupled with the MIX 
model for the ionosphere, creating the coupled LFM-MI X is a state-of-the-art physica l mode l 



for geomagnetic storms occurring in near-Earth space <\Lyon, Fedder, and MobarryY 120041 ) . 
The LFM-MIX is used to explore and understand the physics of space weather, and is a 
crucial part of an ongoing effort to build a space weather forecasting system. The LFM-MIX 
contains three input parameters embedded in physical equations for turning the LFM state 



parameters into energy and flux dWiltberger et all 120091 ) . Exact values of these input param- 



eters are unknown, and our goal is to quantify the uncertainty surrounding these parameters 
when model output is compared to an observed storm, posing substantial statistical chal- 
lenges including large spatio-temporal systems of observations and model output, as well as 
the need to incorporate multiple versions of the LFM-MIX. 

1.1 Geomagnetic Storms 

Geomagnetic storms play an increasingly important role in society. A recent National 
Academy of Sciences report outlined past occurrences of geomagnetic storm disruptions, and 
discussed the importance of preparedness in the future when the Sun returns to its solar peak 



in 20 13, which leads to larger and more frequent geomagnetic storms ([National Research Council 



20081 ). Intense geomagnetic storms adversely affect satellites and can have significant asso- 
ciated costs; in 1994 a Canadian telecommunication satellite experienced an outage due to a 
strong storm, and recovery of the satellite cost between $50 million and $70 million. Large 
storms can interact with electric grids; a superstorm in March 1989 shut off electricity to 
the province of Quebec, Canada for nine hours. Global position systems (GPS) and commu- 
nication systems are affected by large storms; the Federal Aviation Administration's Wide 
Area Augmentation System (WAAS) is a GPS location system for aircraft, whose verti- 
cal navigation system was shut down for approximately 30 hours in 2003 due to a series 
of powerful storms. As society has become increasingly reliant on electricity and satellite 
communication, the potential devastating effects of geomagnetic storms are magnified. 

Geomagnetic storms are caused by the interaction of the plasma and magnetic field of 
the Sun interacting with Earth's magnetic field. Coronal Mass Ejections (CMEs) from the 
Sun release massive twisted magnetic field configurations that can deposit substantial energy 
in the region of near-Earth space known as the magnetosphere. The energy is stored for a 
while, and then is released in an explosive fashion, sending particles down magnetic field 
lines into the ionosphere causing the aurora borealis or northern lights. 

1.2 Computer Experiments 

In the computer experiments literature, tuning of physical model parameters to observa- 
tions is called an inverse problem, and is sometimes referred to as a calibration problem 
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dSantner. Williams, and Notzl 120031 ; I Tarantolai 120051 ). Two features of our setup make the 
traditional approach to design and analysis of computer experiments infeasible. First, obser- 
vational data and computer model output are highly multivariate; modeling m odel output 



and o bservati ons as realizations from a Gau ssia n process (e.g. as p opularized by \Sacks et al 



19891 . see also \Kennedy and O 'Haaan\\200l\ and Wiadon et al\\2004 ) is impractical due to the 



dimensionality of the covariance matrix. The second issue is that the LFM-MIX is available 
at multiple fidelities. In particular, solving the physical equations making up the LFM at a 
lower resolution yields model output that is jointly faster to calculate but does not match up 
as well with observations, a version we call low fidelity. Alternatively, at a higher resolution 
the LFM yields output whose spatial features are more consistent with observational data, 
but which takes substantially longer to run (approximately an eight-fold increase in compu- 
tation time), a version we call high fidelity. We aim to exploit a statistical link between the 
model fidelities, thereby allowing us to explore the input parameter space using the cheaper 
low fidelity version, while performing fewer runs of the high fidelity version. 

The problem of high dimensional observations and model output h as rece ntly become 



acknowledged in the computer experiments literature. \Hiadon et al\ (l2008al ) recommend 
decomposing model output and model bias terms as weighted sums of orthogonal basis func- 
tions. The weights on the basis functions are then modeled as Gaussian processes. Indeed, 
the notion of an orthogonal decomposition has been furth er used by var i ous authors to re- 



duce t he high dimens i onalit y of vector valued model output \Higdon et all l2008bl ; I Wilkinson 



20101 ). \Pratola et all (120131 ) introduce a fast approach to calibration for large complex com- 



puter models. In the geophysical sciences, model output is often spatio-t emporal in nature, 
which typically gives rise to large datasets. What, Haran, and Goes\ (120101 ) develop a calibra- 
tion approach for multivariate spatial data, modeling the model output as a Gaussian process 
across space and input setting, exploiting a separable covariance structure. Our model and 
data also evolve across time , and t he presence of multiple fidelities of model output challenge 
the approach of What et all (120101 ). 

Accounting for multiple versions of model o utput is a second problem tha t has recently 



arisen in the computer experiments literature. \Kennedy and O'Haaanl (120001 ) introduce an 
autoregressive Markov property for multiple fidelities of model output, modeling the innova- 
tion as a Gaussian process. While their idea is extended to a continuum of model fidelities , 
a crucial and restrictive assumption is that the model output is scalar. \Qian et all (120061 ) 
develop an appr oach to combining two levels of fidelity that is extended to a Bayesian hierar- 
chical setting by \Qian and Ww (120081 ) . The idea is to decompose the high fidelity output as 
a regre ssion on the low fidelity version, and m odel the interce p t and slope as Gaussian pro- 



cesses. \Forrester, Sobester, and Keanel (120071 ) and lLe GratieA (120121 ) recommend co-kriging 
for multiple fidelities of output, but do not consider the issue of large datasets. We exploit 
similar ideas to these authors in our construction, although we must take care to reduce 
the dimensionality of the data, as both versions of the LFM-MIX are highly multivariate. 
It is worth mentioning that there is some literature on emulators for multivariate computer 
models, but our current interest is not in emulation, but rather parameter identification 
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(IRougieA 12003 Wouaier et all 120091 ). 



Herein we develop methodology for quantifying the uncertainty about tuning parameters 
for high-dimensional spatio-temporal observations and physical model with two levels of fi- 
delity. We exploit an empirical orthogonal function (EOF) decomposition of the low fidelity 
spatial field, and an EOF decomposition of a discrepancy function li nking the low and high 



fidelity versions of the computer model. Our work generalizes that o uKennedy and O'Hagan 



( 1200 if ) to account for large spatio-tempo r al data sets. The techniques introduced below also 
generalize the approach of \Higdon et al\ (j2008al ) to account for two levels of model fidelity. 
The methodology is illustrated on the LFM-MIX and the Lorenz '96 system of equations 



governing a simplified atmosphere uLorenzl Il996l 120051 ). where we know true values of the 



input parameters. For both models, after initial p arameter estimation, we prop o se a s equen- 
tial design based on expected improvement (EI; I Jones, Schonlau, and Welchl Il998l). Our 
development of expected improvement generalizes the approach of Uones et all (119981 ) to 
sequential design for spatio-temporal data. 



2 LFM-MIX and Observations 

The physical model we examine is a coupled magnetospheric-ionospheric model for geomag- 
netic storms in near-Earth space. The magnetohydrodynamical solver is the Lyon-Fedder- 
Mobarry (LFM) model which consists of five physical equations defining the spatial and 
temporal evolution of the interaction between the solar wind and Earth's magnetosphere. 
These five magnetohydrodynamic equations must be solved numeri cally by discretiz i ng th e 



equations to a spatio-temporal grid, using the partial donor method ( \Wiltberger et al.[\2004i ). 
There is a coarsest grid on which the equations are solved that still yields physically mean- 
ingful model output at a reduced computational cost. Discretizing the equations on a finer 
grid by doubling the number of spatio-temporal points (in the polar and azimuthal angle 
directions, as well as at a finer temporal scale) results in higher fidelity model output, but 
substantially increases the computational time required to complete model runs. Intuitively, 
doubling the grid density in three directions results in a 2 3 = 8-fold increase in computation 
time; in practice, the higher resolution version is an approximately 5.5 to 6-fold increase in 
computation time as compared to the lower resolution. As boundary conditions, the LFM 
requires solar wind, initial strength of the magnetic field, and the level of ultraviolet light 
from the Sun. For any single geomagnetic storm, these boundary conditions are fixed and 
are not considered input parameters. 

The LFM solver is coupled to an ionospheric model, the MIX, forming the fully coupled 
LFM-MIX. The MIX model requires information about the energy and number flux of the 
electrons precipitating into the ionosphere along magnetic field lines. Three physical equa- 
tions define energy and number flux inputs. The equations relate initial energy eo, sound 
speed cf, number flux F , the density of innermost cells of the magnetospheric grid p, the 
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field aligned electrical potential energy difference en and upward field aligned current Jn as 



e = ac 2 s , F = f3p^ } e\\ = , (1) 



sec 



Wiltberger et al\ (120091 ) for further discussion. An important quantity called total energy 
is defined as eo + e\\. Here, a, {3 and R are tuning factors that are included to account 
for physical processes outside the scope of the LFM. The exact values of these parameters 
are unknown, and we seek to quantify the uncertainty about these parameters when model 
output is compared to observations. The parameter a accounts for the effects of calculating 
electron temperature from the single fluid temperature, (3 is included to adjust for possible 
plasma anisotropy and controls a loss filling cone, while R allows scaling of the parallel 
potential drop based on the sign of the current and accounts for the possibility of being 
outside the regime of the scaling. Notice the total energy is a nonlinear function of a and R, 
while flux is a function of /3; later when we develop the statistical model, we take advantage 
of these functional relationships. 

Regardless of the resolution of the LFM input, the MIX coupler output is always on 
the same spatio-temporal resolution. Hence, unlike uncoupled models, the low and high 
resolution LFM-MIX output is co-located, and we will refer to the low resolution output 
as low fidelity, and the high resolution output as high fidelity. This allows us to directly 
compute the scalar difference between the two fidelities without regridding. Model output 
from the LFM-MIX is a bivariate spatio-temporal field, for the variables of energy (in keV) 
and flux (in -A - ) • Developing a bivariate spatio-temporal model is beyond the scope of the 
current manuscript, and we focus on uncertainty estimation using only the energy model 
output. 

The observational dataset we examine is a bivariate spatio-temporal field observed during 
a January 10, 1997, geomagnetic storm from 2pm to 4pm UTC, with 18 equally spaced time 
points. The storm was observed by the Ultraviolet Imager on the Polar satellite, deriving 
the two variables of energy (in keV) and energy times flux (in simultaneously. The 
observations were recorded on a grid of 170 locations, leading to a dataset of 6120 correlated 
observations. The LFM-MIX model output is on a grid of 1656 locations such that the 
observational grid is a subset of the model output. 



3 Parameter Estimation for the LFM-MIX 

We require initial runs of the low and high fidelity model to inform a statistical relationship 
between the two. As our initial experimental design, we run the LFM-MIX at a sampling 
of points in the three dimensional space defined by a G [0,0.5],/? G [0,2.5] and R G [0,0.1], 
which is the hyperrectangle defining physically feasible values of (a,(3,R). 
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3.1 Design 



Using the hyperrectangle [0, 0.5] x [0, 2.5] x [0, 0.1] of values for 9 = (a, , R) , we ran the low fi- 
delity version at 20 sets of input settings based on a space-filling design & Johnson, Moore, and Ylvisaken . 



19901 ). Call this model output L(s,t,9 p ) at location s G M 2 , time t and input setting 



9 P = (a p , (3 P , R p ), p = 1, . . . , 20. We also ran the high fidelity version at a nested, space-filled 
subset of 5 of the original 20. Similar to the low fidelity, call the model output H(s, t, 8 P ), for 
p — 1, ... ,5. Setting up the initial design in such a way that the low and high fidelity ver- 
sions are nested, i.e. run at co-located input parameter settings, yields direct observations of 
the discrepancy H(s,t,9 p ) — L(s,t,9 p ), and assists in developing the statistical relationship 
between the two. If the design were not nested, we would require estimated discrepancies 
H(s,t,9p)—L(s,t,9p) or H(s,t,9 p ) — L(s,t,9 p ) to explore the statistical relationship, thereby 
introducing additional uncertainty. The choice of 20 and 5 runs for the low and high fidelity 
model, respectively, is due to the expensive computational cost of running the LFM-MIX. 
For our study geomagnetic storm, the low fidelity model runs in 16 hours, while the high 
fidelity model requires approximately 84 hours per run on a Linux cluster with 8 processors. 
In total, the initial design took approximately 740 hours to run. Note the benefit of exploit- 
ing the lower fidelity, but faster running version - had we run the high fidelity model on 
the initial design of 20 input settings, the computational time would be approximately 1680 
hours. Hence, the inclusion of the cheaper low fidelity model allows us to reduce the initial 
computational load by about 56%. 



3.2 Statistical Model 



Following an approach popularized by \Kennedy and O'Haaani (120011 ). we suppose there is 
an unknown setting, #o, for which the high fidelity model is an adequate representation of 
reality. In particular, for observations of energy (in keV), Y(s,t), at grid point s and time 
t, we have 



Y(s,t) = H(s,t,9 



+ e(s,t) 



(2) 



where e(s,t) is measurement error, which we assume to b e normally distributed with m ean 
zero and variance r 2 . Our approach slightly differs from \Kennedy and O'Haaam (120011 ) in 
that we do not entertain a model discrepancy term. Our setup is a large-scale in verse 
problem, where model discrepancy is not part of the traditional setup (I Tarantolal , 120051 ) . We 
also point out that we have only one geomagnetic storm, and any model bias term would be 
confounded with the error process e(s,t), without severe simplifying assumptions. 

To fully exploit the information from the low fidelity model, we require a link between 
the coarse model L and the higher fidelity model H, which yields output fields that are more 
consistent with observational data. Specifically, we link the low and high fidelity models 
with an additive discrepancy function 5(s,t,9), where 



H(s,t, 



L(s,t,9) + S(s,t,\ 



(3) 
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Qian and Win (120081 ) considered including a multiplicative discrepancy function as well, 
yielding a decomposition of the form H(s,t,6) = , y(s,t,9)L(s,t,9) + S(s,t,6). For the 
LFM-MIX, both fidelities produce output fields that are of approximately the same magni- 
tude, so we consider only an additive discrepancy function, although the greater flexibility 
of a full multiplicative and additive bias may be useful in other settings. By defining a 
statistical relationship between the low and high fidelity versions of the LFM-MIX, we have 
inherently also developed an emulator for the high fidelity model, based on runs from the 
cheaper low fidelity version, but reassert that our main interest is in the parameters (a, (3, R). 

The model and observations are highly multivariate space-time fields, where, with only one 
storm and 20+5 initial computer model runs, we have 748,260 correlated points (1,656 grid 
locations for the 25 LFM-MIX output runs at 18 time points plus 170 observation lo cations 



over 18 time points). The traditional approach used by lKennedy and O'Haaan fl200ll ) is dial 



lenging to implement for large space-time datasets, as this would require inverting a covari- 
ance matrix of dimension 748,260x748,260. Indeed, in their implementation, the covariance 
matrix wo uld have to b e invert ed at each step of an MCMC procedure. Hence, with spirit 



similar to \Hiadon et al\ ( I2008af ). we use a principal component decomposition approach to 
reduce dimensionality. In particular, we decompose the low resolution model output and dis- 
crepancy function as weighted sums of orthogonal spatial basis functions. In the geop hysical 



scien ces, these spatial functions are known as empirical orthogonal functions (EOFs; WiklA 



2010T ) . In particular, define the spatial vectors X(ij,0 p ) = (L(si,ti,9 p ), . . . , L(s ns ,ti,6 p ))' , 
where n s = 1656 is the total number of grid points of model output, n t = 18 is the number 
of time points, i = 1, . . . , n t and p — 1, . . . , 20. Define the n s x (20 x n t ) dimensional matrix 

X = [X(tx, 9i), X(£ 2 , • • • , X(t nt , #20)] 

so that each column is a spatial vector at a given time point and input setting. The EOFs 
are the columns of U, where we use the singular value decomposition X = UDV, and 
the EOF coefficients are contained in DV. In particular, there are 20 x n t EOFs, each 
of which is length n s . We perform a similar decomposition for the discrepancy process 
S(s,t,6) = H(s,t,9) — L(s,t,9), where there are 5 x n t EOFs, each of which is length n s . 
Our motivation for decomposing the model output as basis functions over space, rather than 
space-time, is driven by exploratory analysis. In particular, the first main spatial mode of 
variation of the low fidelity model output (i.e. the first EOF) exhibits a magnitude with a 
structured form that is similar to the physical equations (p]) and whose magnitude modulates 
up and down as the CME passes over the Earth. This aligns with expert understanding of 
geomagnetic storms, as the effect of the CME passing over the Earth is a period of increasing 
energy and flux, followed by a decay to pre-storm conditions. 

We statistically model the low fidelity model output as a truncated sum of weighted EOFs, 

til 

L(s, t,9) = J2 u Le{s) v e (t, 9) + e l (s, t, 9) (4) 

e=l 
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and similarly the discrepancy function as 



5(a, t,0) = ^2 u 5e (s) w e (t, 9) + e s (s, t, 9) (5) 



e=l 



where the u basis functions are the EOFs contained in the U matrices above, and the v 
and w coefficients are the loadings contained in the DV matrices. We choose sum limits 
of til = 3 and ns = 4 to capture 99% of variability of low fidelity model output, and 90% 
of variability of the discrepancy process, respectively. To capture 99% of variability for the 
discrepancy process, for example, we would require th e first 26 EOFs, which would detract 



from a parsimonious formulation; \Hiadon et all (J2008aj) also suggest that a Gaussian process 



representation of high order basis function coefficients tends to perform poorly in terms of 
prediction. Here, El and e$ are independent mean zero normally distributed white noise 
error terms with variances r| and r|, respectively. The statistical model is completed by 
assuming the coefficient processes v e (t,9) and w e (t,9) are Gaussian processes. 

Based on the physical equations that define the total energy and number flux of precip- 
itating electrons for the MIX model, we impose a nontrivial mean function on the first low 
fidelity loading, v\. Utilizing the functional form of the total energy equation, e + e\\, we 
specify a nonlinear mean function 

Evi(t, 9) = 70 + 7ia + 72-Ra/« + 73Cos(27rt/n t ) + 7 4 sin(27rt/n t ). (6) 

The harmonics in the mean function are due to the nature of geomagnetic storms; as the CME 
passes over the Earth, the average background energy field increases in magnitude followed by 
a decay to the average background. The harmonics capture the physical temporal evolution 
of the geomagnetic storm over the period of our observations. We give the w\ loading process 
a constant mean parameter, allowing the variability of the discrepancy process across input 
setting to be captured by second order structures. For all e > 1, ~Ev e (t, 9) = Ew e (t, 9) = 0. 

All that remains to be specified are the covaria nce functions on the EOF load ing processes. 



We use a separable Matern correlation structure RGuttorp and Gneitincn . 120061 ) . The Matern 
correlation is defined as 

M„(tyA) = £4(|tyA|)K,,(| h/X\) 

where h G R, v > is the smoothness parameter and A > is the range parameter. The 
model correlation is 

C(ti, t2, 9\, 92] X a , X/3, Xr, At) = 

M 2 l^l) M 2 ( b^*) M 2 ( Rl 7 R2 ) M 2 ftl ~ h 



X a J \ Xp J \ Xr J V A 

where we fix the Matern smoothness at 2. A process with Matern correlation with a smooth- 
ness of 2 has realizations that are almost twice differentiable; in particular this imposed 



S 



Table 1: Parameters for the mean function of V\(t, 9) and separable Matern covariance func- 
tions for all EOF coefficient processes, as estimated by ordinary least squares and maximum 
likelihood, respectively. Ranges of a, (3 and R have been standardized to [0, 1] for this Table. 





7o 


7i 


72 


73 


74 


Ev 1 (t, 9) 


16.0 


180 


2804 


-0.201 


16.6 




a 


A Q 




\r 


Xt 


vi(t,9) 


11.2 


0.22 


0.19 


0.1 


0.051 


v 2 (t,e) 


88.8 


3.10 


0.08 


0.1 


0.248 


v 3 (t,9) 


80.1 


2.58 


0.24 


KT 3 


0.200 


Wl (t,9) 


24.5 


1.05 


0.58 


0.01 


0.067 


w 2 (t,9) 


18.7 


10~ 3 


0.03 


3.21 


0.046 


w 3 (t,9) 


16.9 


0.17 


10~ 6 


5.98 


0.035 


w 4 {t,9) 


15.3 


0.18 


1.52 


0.02 


0.028 



assumption aligns with the evolution of the geomagnetic storm across time, as a smoothly 
varying process. Secondly, numerical model output typically smoothly varies with input 
setting, and researchers in the computer experiments literature often use a Gaussian cor- 
relation function C(h) = exp(— \h\ 2 ), which coincides with the Matern class with infinite 
smoothness. However, it is well known that these Gaussian correlation functions lead to nu- 
merically poorly behaved covariance matrices, and in fact researchers often add an artificial 
ridge to the covariance matrix for stability. The smoothness of a spatial process is difficult 
to estimate, and using a fixed smoothness of 2 on the coefficient processes implies model 
output varies smoothly between input settings. The model is completed by specifying the 
covariance functions of the EOF loadings as 

Cov(v e (t 1 ,9 1 ),v e (t2,9 2 )) = alC(ti,h, 9i,9 2 ; X ae , Xp e ,XRe, X te ) (7) 

The same separable covariance model is assumed for the w e coefficients, but with distinct 
parameters. Notice that although we use a separable structure for the coefficient processes 
at each level of EOF, the final statistical model is not separable, but rather has a covari- 
ance function that is a weighted sum of separable c ovariances; this class of covariances 



is a type of well established pro duct-sum covariances We Cesare, Myers, and Posa\ , 12001 



De Iaco, Myers, and Posca . 1200 ll ) 



3.3 Estimation 



The main parameters of interest are the input parameters 9 = (a,f3,R), and all other 
statistical parameters, such as mean fu nction coefficients a nd covariance function ranges 
and variances, are of secondary interest. Wayarri et al\ (120071 ) argue that the uncertainty in 
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these secondary parameters is typically substantially less than the uncertainty in the input 
parameters, so that fixing the statistical parameters is justifiable in practice. In this light, 
we take an empirical Bayes approach to uncertainty quantification, where the mean function 
parameters of the EOF loading processes are estimated by ordinary least squares (OLS), 
and the remaining covariance function parameters are estimated by maximum likelihood 
(ML), conditional on the mean estimates. The observational error is taken to be 5% of the 
empirical standard deviation of energy observations, aligning with our collaborators' expert 
knowledge of the typical observational error for this type of dataset. 

Table [1] displays the OLS estimates of the mean function parameters and ML estimates 



of the s eparable Matern covariance function parameters. Recall the results oi lHiadon et al. 



( I2008af ) in that the inclusion of higher order principal component terms typically does not 
assist in prediction. As anticipated with a basis decomposition, the low order coefficients 
have more variability than the high order coefficients (noting that much of the variability of 
v\ is accounted for in the nonstationary mean function). The input parameters in Tabled] 
have been standardized to the unit interval to ease comparisons between input parameter, 
and we see that the greatest correlation for the low fidelity decomposition is across the a 
index, with (3 and R on the same order of correlation decay. The discrepancy function, on 
the other hand, tends to be more highly controlled by the R index, with a and /3 sharing 
approximately the same decay rate of correlation on average. This indicates that, while there 
is some information regarding contained in the energy model output, there is substantially 
more for a and R, which is expected, recalling the physical equations ([T]). 

Fixing the mean and covariance estimates, we impose independent uniform priors on a, (3 
and R, with uniformity over the bounding boxes described at the head of this Section. Define 
the following vectors: 

Y(t) = (Y( Sl ,t),Y(s 2 ,t),...,Y(s no ,t)y 
H(t, 9) = (H( Sl , t, 9), H(s 2 , t,9),..., H(s na ,t, 9))' 
L(t, 9) = (L(8i, t, 9), L{s 2 , t,9),..., L(s na ,t, 9))' 

where n = 170 is the number of locations of observations; note we implicitly order the 
observations and model output (and corresponding EOFs) such that the first n Q entries 
are the shared locations between the observations and model output, and the last n Q + 1 
to n s entries of H(i, 9) and L(i, 9) are the model output locations with no corresponding 
observations. Then combine these vectors into 

Y^Y^Y^)',...^^)')' 
H(9) = (U(t 1 ,9y,H(t 2 ,9)',...,H(t nt ,9)y 

L(9) = (L(t 1 ,9)',L(t 2 ,9) , ,...Mtn t ,0)y. 

Finally, combine the high and low fidelity vectors across input settings, 

H=(H(^)',H(^)',...,H(W 
L = (L(0 1 )',L(^) / ,...,L(M') / - 
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Then Z = (V, H', I/)' is viewed as a realization from the stochastic process defined by 
© an d ©• Conditional on the realization Z, the posterior distribution of 9 is 
sampled using a Metropolis-Hastings algorithm by block updating the vector 9 at each step. 
In particular, we use independent normal proposal densities centered at the current MCMC 
sample, with standard deviation one tenth of the standard deviation of the initial design 
points (over 9). 

Computation of the density of Z is difficult due to the large dim ension; for ou r initia l 
design and observations Z is of length 74 8,260. Utilizing Resul t 1 irom lHigdon et al\ ( 2008a ) 
alleviates this problem. In particular, \Higdon et all (j2008al ) suppose x ~ N(0, Ti x ) and 
£ ~ N(0, £ € ) are independent. Let Z = \Jx + £, and define /3 = (U'E^U) 
the likelihood function of Z can be written 



^'E^Z. Then 



L(Z) oc lE^-^lU'E^Uj-^exp f~Z'(I£ 1 - S- 1 U(U'E- 1 U)- 1 U , E^ 1 )Zj L0). (8) 



In our case, U is a block diagonal matrix of EOFs, with 1 + 5 + 20 blocks. The very 
first block corresponds to the observations and is itself a block diagonal matrix with n t 
identical blocks, each of which contains the truncated EOFs corresponding to the observation 
locations: 

' WLlOl) ••• ULn L { S l) M <5l( s l) W<5n 4 ( S l) > 

\U L l(s no ) ■■■ U L n L (s no ) U S i(s no ) ••• U Sn5 (s no ) J 

so that the first block of U has dimension (n t x n a ) x (n t x (til + ns)), in our case (18 x 
170) x (18 x (3 + 4)) = 3060 x 126. The next 5 blocks of U correspond to the high resolution 
model output, and again contain nt blocks of EOF matrices, each of which is 

' Uli(sx) ■■■ U LnL (Si) Usx(si) ■■■ USn s (si) \ 

\u L i(s ns ) ■■■ u LnL (s ns ) u 51 (s ns ) ■■■ u Sns (s ns ) J 

Hence each of these 5 blocks of U is of dimension (n t x n s ) x {n t x {n L + rig)), in our case 
29808 x 126. The final 20 blocks of U correspond to the low fidelity model output, each of 
which is a block diagonal matrix consisting of n t blocks the following EOF matrices: 

/ Un(Sl) ••■ ULn L (si) \ 
\ U L1 (s ns ) ■ ■ ■ U LnL {s ns ) j 

Thus, each of the last 20 blocks of U is of dimension (n t x n s ) x (n t x til), in our case 
29808 x 54. 
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The entries of x are EOF weights v e (t,9) and w e (t,6). As with the matrix U, it is 
convenient to divide x into 1 + 5 + 20 segments. The first segment consists of the observation 
EOF coefficients 



( v (*i, e y, w(ti, e y, v(t nt , e Q y, w(t nt , e Q yy. 



where 



v(t,e) = ( Vl (t,e),...,v nL (t,e)y 

w(t,0) = { Wl {t,6),...,w ns {t,6)y. 
The following 5 segments correspond to the high fidelity runs, each of which consists of 

(v(*i, e p y, w(h, e p y, . . . , v(t nt , e p y, w(t nt , e p yy 

for p = 1, . . . , 5. The final 20 segments correspond to the low fidelity runs and consist of 

(v(t 1 ,^)',...,v(i nt ,W 

for p = 1, . . . , 20. Note that Result 1 oi \Higdon et al. fl2008af ) requires x be centered at zero; 
to this end we apply Result 1 to Z — UEcc = XJ(x — Esc) + 

Similar to U and x, we break up 1 + 5 + 20 segments of The first fit x n have variances 
t 2 +t|+t| ; the following 5xn t xn s have variances t|+t| and the remaining 20xn t xn s entries 
have variance s r|. This completes our model's formulation of the likelihood decomposition 
of Result 1 oi \Higdon efdl fl2008ah . 

Exploiting the EOF decomposition of the model output dramatically reduces dimension- 
ality of the problem. For example, a typical Gaussian process approach to our setup would 
require inverting a matrix of dimension 748, 260 x 748, 260, whereas, for example, inverting 
U'S e U is feasible, as it is a matrix of dimension 1836 x 1836. 



4 Results and Sequential Design 
4.1 Initial Calibration 

Initially, we begin by running five independent chains of posterior samples simultaneously, 
from random starting values. The posterior samples based on the initial design are shown in 
Figure [T]as small black dots. Notice the distribution is multimodal, and there is an apparent 
nonlinear inverse relationship between a and R. In fact, the curve along which the posterior 
samples fall for (a, R) define a posterior distribution of total energy. Recall Equation (JB]), 
where we exploited the functional form of total energy, of a form a + Ry^a. These results 
suggest that the quantity of total energy is well defined based on our observations and initial 
design, and a combination of pairs of input parameters (a, R) that approximately yield this 
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Figure 1: Posterior samples using only the five high fidelity runs (small grey dots), and using 
the entire initial design of five high fidelity and 20 low fidelity runs (small black dots) with 
input pairs at which the low fidelity model was run (unfilled circles) and input pairs at which 
both low and high fidelity models were run (filled circles). 

total energy are appropriate for our dataset. Notice that /3 is not especially well identified 
based on our observations. This is expected, as we currently are modeling only energy, and j3 
is a controlling parameter for flux, although the information in the energy variable regarding 
(3 is not negligible. 

Let us illustrate the benefit of using the low fidelity model in conjunction with the high 
fidelity model. If there were no extra information added by including the low fidelity model 
output, we would expect the posterior samples based exclusively on the high fidelity version 
to be the same as including both model fidelities. The small grey dots of Figure [I] are 
posterior samples for the input parameters based on only the five high fidelity runs, here 
ignoring the 20 low fidelity runs. In particular the statistical model remains the same, except 
where we write 

H(s, t,6) = J2 u He{s) v e (t, 9) + e H (s, t, 9) (9) 

e=l 

where nn = 3, and Kvi(t, 9) has the same functional form as ([6]). Comparing the two sets 
of posterior samples in Figure [I] shows the gain in augmenting the high fidelity runs with 
the low fidelity information - the location of the curve in panel (b) for the pair (a, R) is 
adjusted downwards when also using the low fidelity runs and a posterior mode is ruled out. 
Specifically, the posterior mode about (a, R) ~ (0.35, 0.01) is no longer present. Hence, our 
posterior uncertainty regarding the parameters a and R has decreased due to the inclusion 
of the low fidelity output. The posterior samples for (3 are slightly adjusted when the low 
fidelity information is included, although not necessarily the same amount as for a and R, 
again, due to the fact that /3 is linked to flux. 

There are two potential explanations for the multimodal nonlinear behavior of the poste- 
rior distribution shown in Figure [T^b). The first is that the observations have no information 
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regarding the specific pair of (a, R) that is optimal, or alternatively the curve is an artefact 
of the sparse initial design. In particular, with only 5 runs of the high fidelity model, it 
is unlikely that the discrepancy function S(s,t,9) has been well estimated, and given more 
runs of the LFM-MIX, the posterior distribution may shrink to one of the modes of Figure 
CD To this end, we develop a sequential design based on expected improvement. 



4.2 Expected Improvement for Sequential Design 

We seek to perform an additional run of the LFM-MIX based on current information, and 
expected improvement (EI) is one approach to sequential design that incorporates accuracy 
and uncertain ty. Expected imp rovement was originally developed for black-box function 



optimization Uones et all Il998f ). but we adjust the idea for our purposes of parameter 
identification. To begin, we define the improvement function for a given location and time 
as minimizing the squared residual between the high fidelity model output and observations: 

I(s, t, 9) = max {/ min - (Y(s, t) - H(s, t, 6))\ 0} (10) 

where f min — min^ =1 (Y(s, t) — H (s , t , $«)) 2 is the observed minimized squared residual over 
the initial runs of the LFM-MIX. The EI is defined as a sum of expected improvement 
functions over all locations and times, 

EI(6) = ^EI(s,t,6), (11) 

s,t 

and is a function only of input parameter 9. 

To write the closed form of EI at an arbitrary setting 9, we require the conditional 
distribution of the high fidelity model, given the current runs. In particular, we have 

H(s, t, 9) | {H(s, t, {L(s, t, 0i)}Zi ~ N(tf, a 2 ) (12) 

where H and a 2 are simply a conditional mean and variance of the multivariate normal 
defined by Equations fl3]), fll]) and (JSJ). Let Q± = (Y — H ±y/ f min )/a, we simplify notation by 
setting Y = Y(s,t) and <fi and $ are the standard normal density and cumulative distribution 
functions, respectively. Then the expected improvement at location s and time t has closed 
form 

EI(s,t,9) = (f min -{Y- H) 2 - a 2 )($(Q+) - $(£_)) 

+ ° (( v 7 ^ + H - Y)<f>(Q+) + ( yff~ + Y- H)<P{Q-)) . (13) 

See the Appendix for a derivation. Notice that EI is indeed a weighting between uncertainty 
(a) and accuracy ((Y — H) 2 ). For example, if, at a new setting 9, our predictive variance 
for the high fidelity model output were small, then the latter term of f|T3|) will be negligible, 
and the EI will be controlled by the accuracy in the first term as a function of (Y — H) 2 . 
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Figure 2: Expected improvement surface, with initial posterior samples (small dots) based 
on initial design over 6 with input pairs at which the low fidelity model was run (unfilled 
circles) and input pairs at which both low and high fidelity models were run (filled circles). 

Figure |2] shows the EI surface as a function of f3 and R for the best value of a (0.5). 
As previously, the open circles are locations at which we ran the low fidelity model, and 
the closed circles are the locations at which we ran both fidelities. There are a number of 
interesting features illustrated by this surface. The EI surface is multimodal, with the most 
pronounced mode at (p,R) = (2.5,0.068), falling directly between two modes of the initial 
posterior samples. In this area, the uncertainty is substantial enough that an optimum may 
be in the area. Note there are no high fidelity model runs in the immediate area; that the EI 
maximum also falls directly between two posterior sample modes indicates that EI is indeed 
a weighting between uncertainty and accuracy. EI is sensitive to the initial design, at most 
of the locations where the low or high fidelity model was run, there are relatively low values 
of EI, as we have already reduced our uncertainty in those areas. However, the EI surface 
also follows the general trend of the initial posterior samples, indicating our initial samples 
fell in areas of high model accuracy. 

We ran the high and low fidelity version of the LFM-MIX at the greatest mode indicated 
by the EI surface, specifically at (a,(3,R) = (0.5,2.5,0.068), and conditional on this ad- 
ditional run, sampled from the posterior distribution of the input parameters. If no extra 
information were added due to the sequential design run, we would see the same posterior 
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Figure 3: Second round of posterior samples (small dots) based on initial design plus the run 
suggested by the expected improvement criterion with input pairs at which the low fidelity 
model was run (unfilled circles) and input pairs at which both low and high fidelity models 
were run (filled circles). 

samples as in Figure [TJ The second round of posterior samples, conditional on the initial 
design plus the single additional run suggested by EI, are shown in Figure El The substantial 
change between Figures [1] and [3] can be seen in the third panel (c), the pairwise posterior 
samples for (3 and R. In particular, the upper leftmost mode that was present in Figure 
□Jc) has been ruled out now, as there are no posterior samples in this area. Our posterior 
uncertainty has decreased due to the single additional run suggested by EI. Our information 
regarding R has also increased due to the added EI run, as the initial middle mode about 
R = 0.7 has now split into two smaller modes. 

In previous experiments with the LFM-MIX, continuing sequential design based on EI im- 
proves the posterior distribution of (a, R) slowly, and primarily explores the three-dimensional 
(a,(3,R) space over (3. This reiterates the substantial uncertainty in (3 based on the energy 
variable alone, and unfortunately, due to the high budgetary demand of running the LFM- 
MIX, at 100 hours for each run of the high and low fidelity model on 8 processors, it is 
not within our current budget to continue the sequential design. Future work is aimed at 
including observations for flux, which we anticipate greatly improving identification of (3. 



5 Parameter Estimation for the Lorenz '96 



In the previous section, we outlined a statistical model for combining high and low fidelity 
model output for large spatio-temporal datasets with an application of quantifying the un- 
certainty in input parameters for the LFM-MIX computer model. The initial posterior 
distributions illustrated a strong nonlinear relationship between the parameters a and R, 
and based on a sequential design framework, we saw the posterior distributions shrink in 
variability, ruling out an area of the parameter space present in the the initial multimodal 
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posterior distribution. In this section we illustrate a similar statistical model using a phys- 
ical model with known truth. The goal in this section is to compare our ability to identify 
model parameters using the EOF approximation model with differing initial design sizes, 
and to assess the ability of sequential design under expected improvement in improving the 
posterior estimates of unknown parameters. 

The Lorenz '96 system (hereafter L96) of equations was developed b y Edward Lor enz to 
be a simplified one- dimensional atmospheric model that exhibits chaos f Lorenz . 19961 ). The 



physical model is for 40 variables (known as state variables in the atmospheric sciences). For 
variable Y(s, t), location s = 1, ... ,40 and time t, we have 

dY(s, t)/dt = -Y(s - 2, t)Y(s -l,t)+Y(s-l, t)Y(s + 1, t) - Y(s, t) + F(s) (14) 

where F(s) is a location dependent forcing term, and Y (s, t) is available at an y intege r value 



of s by setting Y(s — 40, t) = Y(s + 40, t) = Y(s,t). For the forcing term, \Lorenz\ ([1996) 
used F(s) = 8, but for our purposes we wish to mimic the behavior of the LFM-MIX using 
this reduced atmospheric model. 

Analogous to the LFM-MIX case, we have two forcing functions, corresponding to a low 
and a high fidelity simulator. In particular, we respectively define the low and high fidelity 
forcing functions as 

F L (s;a,b) = 8 + a + 3a6exp(- cos(2vrs/40))/ exp(l) (15) 
F H (s;a,b) = 8 + a + 3a6exp(-10cos(27rs/40))/exp(10). (16) 

Notice the functional form here, a + ab, is akin to the total energy equation of the LFM-MIX, 
which was of the form a + R^fa. 

Fixing true values of a and b at 1/2 and 3, respectively, the first panel of Figure H] shows 
the corresponding forcing functions for the low and high fidelity versions. Notice the low 
fidelity version appears to smear out the peak defined by the high fidelity forcing function. 
This is akin to the relationship between the differing fidelities of the LFM-MIX, where the 
low fidelity model tends to produce output that is a (spatially) less peaked version of the 
more peaked high fidelity model output. 

The observations are generated from the high fidelity version of the L96, based on 40 
independent initial unit uniform random variables. Solving the equations every 6 hours, we 
run the L96 for 300 years, and use 30-year averaged output, garnering approximate climate 
of the L96. The motivation for time-averaging is that each single realization from the L96 
is highly erratic, as seen in Figure HJ whereas taking time-averages over long periods tends 
to reproduce the forcing function, also displayed in Figure HJ To these 10 time realizations, 
we add independent normal errors for each variable at all time points, whose mean is zero 
and whose standard deviation is five percent of the empirical standard deviation of the 
model output, again to line up with the expert understanding of measurement error for the 
LFM-MIX example. 
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Figure 4: Illustration of the Lorenz '96 model. Forcings for the low and high fidelity versions, 
physical model realizations and a 30- year averaged run. Forcings correspond to a = 1/2 and 
b = 3. 

We suppose it is known that a G [0,2] and b e [0,5]. To explore different design ap- 
proaches, we run two initial designs. The first design assumes greater resources than are 
available for the LFM-MIX. In this situation, we run the low fidelity model at 40 pairs of 
input settings based on a space-filling design, and the high fidelity model at a space-filled 
subset at 20 points of the original 40. This setup is designed is to illustrate our ability 
to tune model parameters in the situation with more resources than are currently available. 
The second design utilizes a space-filled subset of 20 runs of the low fidelity computer model, 
with an additional 5 runs of the high fidelity version, aligning directly with our setup for the 
LFM-MIX scenario. 

To align with the LFM-MIX modeling approach, we suppose the observations are ade- 
quately represented by the high fidelity version of L96, up to white noise. In particular, 
using similar notation as in the previous section where 9 = (a, b), we write 

Y(a,t)=H(a,t,e )+e(a,t) (17) 

where e(s, t) is a white noise process, which we assume to be normally distributed with mean 
zero and variance r 2 . As with the LFM-MIX, we link the low and high fidelity models with 
an additive discrepancy function S(s,t,9), where 

H(s,t,9) = L(s,t,0) + 5(s,t,9). (18) 

Whereas the LFM-MIX is highly multivariate, our L96 example does not require the 
same dimension reduction techniques employed earlier. Although not required, we use similar 
modeling techniques to those employed for the LFM-MIX above in order to explore our ability 
to identify physical parameters in a setting where approximations are required. Hence, we 
write 

L(s, t,9) = J2 u Le{s) v e (9, t) + e L (s, t, 9) 

e=l 
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and 



5(s, t,9) = y^ y u Se (s) w e (9, t) + e s (s, t, 9). 

e=l 

Putting til = 2 and ng — 1 (capturing more than 99% of the variability), the residual 
processes El and e$ are modeled as normally distributed white noise terms with variances 
r| and r|, respectively. As in the LFM-MIX case, we model vi,v 2 and w\ as Gaussian 
processes. Each is endowed with a mean function of the form 70 + 71a + ^by/a, a functional 
form that was decided upon after elementary data analysis; notice we find similar behavior 
to the a + ah form of the forcing functions (I15p and (JUJ). Unlike the LFM-MIX, we suppose 
the v and w processes are independent across time; indeed with the L96, we consider long 
term averages, and viewing the realizations as independent across time is justifiable, whereas 
in the LFM-MIX case, our realizations arise from a continuous process over a relatively short 
time interval. The functional form of the covariance for the v and w coefficient processes is 
cr 2 C(#i, 6* 2 ; A a , Xb) where 9 = (a, b), and 

C(9 X , 9 2 - X a , X b ) = M 2 (j^) M 2 (hj^j , 

where naturally each v\ y v 2 and Wi has distinct covariance and regression parameters. 

For physical parameter estimation, we sample the posterior distribution of 9 condi- 
tional on Z, which is made up of the following components. Define the vectors Y(tj) = 
(Y(8 1 ,t i ),Y(8 2 ,t i ),...,Y(8 ns ,t i )y, H(t<) = (#(«!, * t , 00, H{s 2: t i: 9 x ), ...,H{a n „U,9 nH )y, 
and L(ti) = (L(si,ti, 0i), L(s 2 , t i: 9i), . . . , L(s ng ,ti, 9 nL ))', where the number of low and high 
fidelity samples are and n#, respectively. Combine these vectors into the single time 
point vector Z(t { ) = (Y^)', H^)', L(tj)')', then Z = (Z^)', • • • , Z(t„ t )')'- 

Posterior distributions are shown in Figure (5J with the truth indicated by the intersection 
of solid lines. We consider three cases for posterior sampling - the first is based on a dense 
design of ul =40 and njj = 20, shown in panel (1). The posterior distribution covers the 
truth, but is spread over a swath of plausible values, falling along a curve of the form a + by/a, 
exhibiting similar behavior as the LFM-MIX; note the substantially larger initial design size, 
however. The posterior mode is at approximately (a, b) = (0.51,3.09), indicating accurate 
point estimation, but still displaying substantial uncertainty. 

The middle panel of Figure [5] replicates the situation of the LFM-MIX more closely in 
that we use only Ul = 20 and n# = 5 points in the initial design. The posterior distribution 
covers the true value of (a, b), and again we see a swath of density following a curve similar 
to a + by/a. Here, however, the posterior mode is at (a, b) = (0.40, 3.94), so while the truth 
is indeed captured within the posterior samples, there appears to be some bias. Following 
this sparse initial sample, we run both low and high fidelity version of the L96 at seven 
additional input settings chosen sequentially based on the expected improvement criterion. 
The final panel of Figure |5] displays the posterior distributions based on these til = 27 and 
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Figure 5: Parameter turning the Lorenz '96 model. True parameter values are (a, b) = 
(1/2,3), indicated by the intersection of two solid lines. Each panel contains posterior 
densities with contours overlying posterior samples for (1): large initial design, (2): sparse 
initial design similar to the LFM-MIX, (3): sparse initial design with seven additional runs 
chosen sequentially by expected improvement. Input settings at which the low fidelity model 
was run are displayed as circles both filled and unfilled, and settings where the high fidelity 
model was also run are shown are filled circles. 

Uh = 12 samples. Indeed the posterior variability has decreased as compared to that based 
on the initial design, but also notice that the posterior has substantially less variability 
than the dense initial sample of panel (1). These results suggest we can perform fewer runs 
initially, and rely on a sequential design such as expected improvement to home in on the 
true values. The posterior mode after sequential design is approximately (a, b) = (0.52, 2.96), 
indicating accurate posterior estimation. An interesting note is that the final posterior 
distribution displays three distinct modes (although the mode about the truth is of higher 
posterior density). Given that the sequential design runs cover the posterior modes, we do 
not anticipate the posterior distribution improving greatly, but reiterate that the posterior 
distribution contains and is indeed centered about the truth. 

6 Discussion 

We have introduced an approach to quantify the uncertainty about input parameters for 
large spatio-temporal datasets with high and low fidelity model output. We suppose the 
high fidelity model is an adequate representation of reality at some unknown set of input 
parameters up to white noise. The high and low fidelity models are linked through an 
additive discrepancy function. This link allows us to run the higher cost high fidelity model 
at fewer sets of input parameters, and explore the input setting space with the cheaper 
low fidelity model. In our first example we examined the LFM-MIX model for geomagnetic 
storms occurring in Earth's near space environment, which is partially parameterized by three 
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unknown input parameters controlling energy and flux. Based on an initial experimental 
design, using observations of energy, we discovered a nonlinear relationship between a subset 
of the input settings, which was a level curve for the total energy quantity. One input setting 
was not well identified, but considering that particular variable contributes mainly to flux, 
it is unsurprising that it is not well identified using only energy observations. 

To improve posterior estimation, we developed an expected improvement criterion for 
sequential design. The improvement function seeks to minimize squared distance between 
the high fidelity model and observations. We derived the closed form for EI over arbitrarily 
many locations and times, which simultaneously weights uncertainty and accuracy. Based 
on the EI criterion, we performed an additional run of the LFM-MIX and found that the 
posterior distributions for the input parameters indeed shrunk in width. This suggests that 
the nonlinear behavior of the initial posterior distribution is potentially an artefact of our 
sparse initial design. Comparing these results to the contrived Lorenz '96 system with known 
truth, we would anticipate some improvement manifesting as smaller posterior variability if 
we were to continue sequential design based on EI, with the posterior mode eventually settling 
around the true unknown parameter value. 

In a previous set of experiments, we explored sequential design based on EI, and found 
that the criterion primarily becomes overwhelmed by the uncertainty surrounding the input 
parameter involving flux. Due to the high budgetary demand of running the LFM-MIX, it 
is not within our current capacity to continue the sequential design. Our current research is 
aimed at including observations for flux, which we anticipate greatly improving the posterior 
distributions of all three input parameters. 

We reduced dimensionality of the large dataset by projecting spatial fields onto empirical 
orthogonal functions; the motivation was driven by exploratory analysis where the first main 
mode of spatial variation exhibited a magnitude with functional form similar to physical 
equations governing energy and flux for the LFM-MIX. In other contexts for other space- 
time computer models, a different approach may be required. For instance, if the model 
output is a highly nonlinear response of input parameters, a principal component approach 
is likely to be unsuccessful in statistically modeling physical model output. In such cases the 
practitioner may need to pe rfor m statistical tests for space-time s epara bility, such as those 



developed by \Fuentes\ (120061 ) oi lMitchell, Genton, and Gumpertz\ ( 120051 ). 



The clearest route of future research is to develop a bivariate model for energy and flux, 
which will allow us to simultaneously identify the three parameters controlling these two 
distinct variables. One potential solution to this added complication is to use a similar EOF 
decomposition for flux, and use a multivariate Gaussian process representation for the EOF 
coefficient processes for both energy and flux, thereby accounting for correlation between the 
two distinct variables. 

The statistical model did not account for systematic model bias. Our approach is consis- 
tent with the mathematical formulation of solving large scale inverse problems using com- 
puter models and o bserved data (see, e.g. the cosmic microwave background application in 



Hiadon et a/.ll201ll ). With only one observed geomagnetic storm, model bias is confounded 
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with the residual process; with multiple storms we could potentially include a full bias term 
across space and time. However, it is believed by space physicists that the infinite resolu- 
tion version of the LFM-MIX is unbiased, and our high fidelity version is an approximation 
to this infinite resolution. The discrepancy function we introduced connected the low and 
hi gh fidelity versions of the mod el, which is notably different than the original suggestion 
of I Kennedy and O'Hagam (120011 ) of including an additive model discrepancy term. In our 
situation, we have only one realization of the spatio-temporal process, and hence model bias 
is unide ntifiable with o ut so me simplifying assumptions (such as constancy across time or 
space). \Heaton et all (120131 ) al so examines the LFM -MIX, taking a predictive process ap- 
proach to dimension reduction (iBanerjee et all [20081) , and assumes a rotational bias across 
time. That is, the authors assume there is an unknown spatial rotation at each time point 
that defines model bias for the high fidelity version. Their posterior distributions differ from 
those found herein, generally centering on approximately (a,/3,R) = (0.47, 1.59,0.02). This 
is not contradictory to our results in that the assumptions regarding model bias are different 
- indeed, optimal parameter values under rotated model output are expected to be different 
than those under no such rotations. With additional geomagnetic storms, our goal is to 
determine the need for such rotations and potentially fully general space-time model biases, 
but it is currently unclear which of these competing assumptions is necessary. 

The low and high fidelity versions of the LFM-MIX are generated by differing resolutions 
of the LFM model. While in the current work we used only two resolutions, there is potential 
for a higher resolution available that is extremely computationally intense, and must be run 
on a supercomputer on at least 32 processors. Potentially, one way to include this 'highest' 
fidelity is to maintain our model's formulation, and write the high fidelity model as a sum of 
the highest fidelity and a secondary discrepancy function. It is likely that the discrepancy 
connecting the lower fidelities will be correlated with the discrepancy connecting the higher 
fidelities, and hence we anticipate requiring a multivariate Gaussian process model for the 
discrepancy processes. 



Appendix 

In this appendix we derive the closed form for the expected improvement at a single location 
s and time t, equation (JT3"]) . For notational simplicity, write Y(s, t) = Y, H(s, t,9) = H and 
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fmin = /■ Then we have 

EJ(s, t, 0) = E max {/ - (Y - H) 2 , 0} 

= / (f~(Y- Hf) L(H)dH 

OJf>{Y-Hy \ ° J 

= I _ ( f - (Y - H - ax) 2 ) (j){x)&x 

J Y-VJ-H <x Y+VJ-H \ J 

= [ Q+ (f - (Y - H) 2 ) (p(x)dx + 2a(Y - H) [ + X(f)(x)dx 
JQ- V ' J Q- 

— (j 2 I x 2 (p(x)dx 
=A+B+C 

utilizing the change of variables x = (H — H)/a. The three integrals of A, B and C can be 
written 

A= [f-iY-H) 2 ) ($(Q + )-$(Q_)) 

B = 2a(Y-H) (0(Q_)-0(Q + )) 

C = -a 2 (Q_0(Q_) - Q+<KQ+) + $(Q+) - HQ-)) 

using integration by parts and the fact that the antiderivative of x(j)(x) is —<j>{x). Combining 
terms yields (|T3|) . 
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